Microphone Speaker Analysis: Audio Segmentation and Frequency Insights
Taisia-Maria Coconu[1], Costin-Alexandru Deonise[1], Constantin Anghel[2], Cătălin Negru[1], Florin Pop[1],[3],[4]
Abstract. Audio segmentation is the process of dividing an audio
stream, which frequently contains multiple speakers, into homogeneous
sections. This paper explores the implementation of voice-dialing and
recognition algorithms to analyze how accurately the technology can identify
and differentiate speakers in complex environments. The study aims to improve
our understanding of the technology's functionality, including its ability to
discern speakers' emotions and gender. Additionally, a hardware simulation is
conducted using a two-way microphone and an Arduino board. The work emphasizes
precision in speaker recognition and diarization, along with accurate speech
transcription, by identifying optimal parameters and improving on existing
market models. It also explores the applicability of this technology in various
fields by creating applications that rely mainly on speech diarization and
speech recognition.
Keywords: Emotion Detection, Gender Detection, Voice Recognition Hardware System.
DOI 10.56082/annalsarsciinfo.2024.1.5
[1] National University of Science and Technology Politehnica Bucharest, Romania
[2] National Institute of Research and Development in Mechatronics and Measurement Technique, Bucharest, Romania
[3] National Institute for Research and Development in Informatics (ICI), Bucharest, Romania
[4] Academy of Romanian Scientists, Bucharest, Romania